fix(desktop): harden startup retries #247

Closed
Whiteks1 wants to merge 6 commits into main from codex/desktop-electron-p1-hardening

@Whiteks1 Whiteks1 commented Apr 2, 2026

Summary

This PR hardens the Electron desktop shell startup path by:

  • retrying research_ui startup with fallback Python candidates
  • keeping workspace state persistence safe for tabs and launch state
  • restoring the desktop shell to a stable state after startup/reachability checks
  • preserving the desktop-focused boundary without changing broker or execution logic

Why

The desktop shell is the operator-facing bridge into QuantLab.
Startup reliability matters here because the shell is responsible for launching and reaching research_ui before the workspace can be used.

Scope

This PR does not:

  • change broker execution behavior
  • change paper-session behavior
  • change Hyperliquid or Kraken logic
  • introduce new runtime surfaces outside the desktop shell

Validation

Validated with:

  • node --check desktop/main.js
  • node --check desktop/renderer/app.js
  • node --check desktop/renderer/modules/utils.js
  • npm run smoke

Current note:

  • npm run smoke currently fails because research_ui did not become reachable during the smoke run.
  • That is the remaining issue to investigate for the desktop startup path.

Notes

  • the branch was updated against current main before opening this PR so the diff only contains the desktop hardening changes
  • the earlier stale-base deletions are not part of this PR

Summary by Sourcery

Harden the Electron desktop shell startup and refresh behavior to improve research_ui reliability and keep the UI stable under failures.

New Features:

  • Introduce multi-candidate Python resolution and retry logic for launching the managed research_ui server.
  • Add an LRU-based detail cache and configurable limits for detail entries and consecutive refresh errors in the renderer.

Bug Fixes:

  • Prevent stale process handles and listeners from affecting research_ui readiness and exit handling during restarts and retries.
  • Stop the snapshot refresh loop when the workspace server URL is unavailable or after repeated failures, resuming only on explicit retry.
  • Ensure workspace state subscriptions and timers are properly cleaned up on window unload to avoid leaks and inconsistent UI state.
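One common way to keep stale process handles from affecting exit handling is an identity check inside each listener. The sketch below illustrates that idea under assumptions: the name bindResearchUiProcess appears in this PR, but the body, the `currentProcess` variable, and the `onExit` callback are hypothetical, and EventEmitter instances stand in for real child processes.

```javascript
const { EventEmitter } = require("node:events");

// The only handle whose events should count (hypothetical module state).
let currentProcess = null;

function bindResearchUiProcess(child, onExit) {
  currentProcess = child;
  child.on("exit", (code) => {
    // Ignore events from a handle that a restart or retry has replaced.
    if (child !== currentProcess) return;
    currentProcess = null;
    onExit(code);
  });
}

// Stand-ins for two spawned processes: the second replaces the first.
const exits = [];
const firstAttempt = new EventEmitter();
const secondAttempt = new EventEmitter();
bindResearchUiProcess(firstAttempt, (code) => exits.push(code));
bindResearchUiProcess(secondAttempt, (code) => exits.push(code));

firstAttempt.emit("exit", 1);  // stale handle, ignored
secondAttempt.emit("exit", 0); // current handle, recorded
```

The same guard would apply to stdout, stderr, and error listeners, so a late event from an abandoned attempt cannot flip readiness state for the attempt that replaced it.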

Enhancements:

  • Improve error handling and user feedback for API unavailability, including messaging when automatic refresh is paused after repeated errors.
  • Enable Electron sandboxing for the renderer window to tighten security around the desktop shell.
  • Persist and reset snapshot status metadata, including consecutive error counts and refresh pause state, across runtime retry flows.

sourcery-ai bot commented Apr 2, 2026

Reviewer's Guide

Hardens the Electron desktop shell startup and renderer workspace behavior by introducing multi-candidate Python startup with retries, safer process binding, improved workspace refresh error handling, and a bounded LRU cache for details, while enabling renderer sandboxing.

Class diagram for the new LruCache utility

classDiagram
  class LruCache {
    -number maxSize
    -Map cache
    +LruCache(maxSize)
    +clear()
    +has(key)
    +get(key)
    +set(key, value)
  }
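
A minimal sketch of a cache with the surface shown in the diagram, assuming it relies on the insertion order of a Map for eviction. The eviction details, the `maxSize` validation, and the chaining return value of `set` are assumptions; the actual implementation in desktop/renderer/modules/utils.js may differ.

```javascript
class LruCache {
  constructor(maxSize) {
    if (!Number.isInteger(maxSize) || maxSize < 1) {
      throw new RangeError("maxSize must be a positive integer");
    }
    this.maxSize = maxSize;
    this.cache = new Map(); // a Map iterates in insertion order
  }

  clear() {
    this.cache.clear();
  }

  has(key) {
    return this.cache.has(key);
  }

  get(key) {
    if (!this.cache.has(key)) return undefined;
    // Re-insert the entry to mark it as most recently used.
    const value = this.cache.get(key);
    this.cache.delete(key);
    this.cache.set(key, value);
    return value;
  }

  set(key, value) {
    // Delete first so that updating a key also refreshes its recency.
    this.cache.delete(key);
    this.cache.set(key, value);
    if (this.cache.size > this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldestKey = this.cache.keys().next().value;
      this.cache.delete(oldestKey);
    }
    return this;
  }
}

const details = new LruCache(2);
details.set("run-a", { id: "a" });
details.set("run-b", { id: "b" });
details.get("run-a");              // touch run-a so run-b becomes the oldest
details.set("run-c", { id: "c" }); // exceeds maxSize and evicts run-b
```

Note that `has` deliberately does not refresh recency in this sketch; only `get` and `set` do.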

Flow diagram for snapshot refresh loop with error backoff and pause

flowchart TD
  A_start[Start refreshSnapshot] --> B_checkUrl{workspace.serverUrl set?}
  B_checkUrl -- no --> C_stopLoop[stopRefreshLoop and return]
  B_checkUrl -- yes --> D_fetch[Fetch runsIndex, details, errors, brokerHealth, stepbitWorkspace]
  D_fetch --> E_success{All required fetches succeed?}
  E_success -- yes --> F_updateSnapshot[Update state.snapshot with new data]
  F_updateSnapshot --> G_resetStatus[Set snapshotStatus: status ok, error null, lastSuccessAt now, consecutiveErrors 0, refreshPaused false]
  G_resetStatus --> H_render[renderWorkspaceState]

  E_success -- no --> I_errorPath[Handle fetch error]
  I_errorPath --> J_incErrors[consecutiveErrors = previous consecutiveErrors + 1]
  J_incErrors --> K_pauseCheck{consecutiveErrors >= maxConsecutiveRefreshErrors?}
  K_pauseCheck -- yes --> L_pause[Set refreshPaused true and stopRefreshLoop]
  K_pauseCheck -- no --> M_continue[Keep refresh loop running]
  L_pause --> N_setErrorStatus[Set snapshotStatus: status error, error message, lastSuccessAt unchanged, consecutiveErrors, refreshPaused]
  M_continue --> N_setErrorStatus
  N_setErrorStatus --> H_render

  H_render --> O_buildAlert[buildRuntimeAlert]
  O_buildAlert --> P_errorAlert{snapshotStatus.status is error?}
  P_errorAlert -- yes --> Q_showWarn[Show warn alert: API unavailable + optional Automatic refresh paused suffix]
  P_errorAlert -- no --> R_otherStates[Show other runtime alerts or none]

  subgraph User_retries
    S_userClick[User clicks Retry API] --> T_retryWorkspace[retryWorkspaceRuntime]
    T_retryWorkspace --> U_restartServer[Optionally restart workspace server]
    U_restartServer --> V_resetCounters[Reset snapshotStatus.consecutiveErrors to 0 and refreshPaused to false]
    V_resetCounters --> W_refreshOnce[Call refreshSnapshot]
    W_refreshOnce --> X_maybeLoop{workspace.serverUrl set and snapshotStatus.status not error?}
    X_maybeLoop -- yes --> Y_resumeLoop[ensureRefreshLoop resumes interval]
    X_maybeLoop -- no --> Z_noLoop[Do not resume loop]
  end
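
The snapshotStatus transitions in the flow above can be expressed as small pure functions. This is a sketch under assumptions: the field names follow the diagram, but the function names `onRefreshSuccess`, `onRefreshError`, and `onUserRetry`, the default limit of 3, and the exact alert wording are hypothetical.

```javascript
// Hypothetical default; the PR describes the limit as configurable.
const MAX_CONSECUTIVE_REFRESH_ERRORS = 3;

// Success path: clear the error and reset the counters.
function onRefreshSuccess(now = Date.now()) {
  return { status: "ok", error: null, lastSuccessAt: now, consecutiveErrors: 0, refreshPaused: false };
}

// Error path: count the failure and pause once the limit is reached.
function onRefreshError(previous, error, maxErrors = MAX_CONSECUTIVE_REFRESH_ERRORS) {
  const consecutiveErrors = previous.consecutiveErrors + 1;
  return {
    status: "error",
    error: error instanceof Error ? error.message : String(error),
    lastSuccessAt: previous.lastSuccessAt, // unchanged on failure
    consecutiveErrors,
    refreshPaused: consecutiveErrors >= maxErrors,
  };
}

// Explicit retry: reset the counters before refreshing once more.
function onUserRetry(previous) {
  return { ...previous, consecutiveErrors: 0, refreshPaused: false };
}

// Alert text is an assumption based on the labels in the diagram.
function buildRuntimeAlert(snapshotStatus) {
  if (snapshotStatus.status !== "error") return null;
  const suffix = snapshotStatus.refreshPaused ? " Automatic refresh paused." : "";
  return { level: "warn", message: `API unavailable.${suffix}` };
}

// Three failures in a row after one success pause the loop.
let status = onRefreshSuccess(1000);
for (let i = 0; i < 3; i += 1) {
  status = onRefreshError(status, new Error("fetch failed"));
}
const pausedStatus = status;
const resumedStatus = onUserRetry(pausedStatus);
```

After `onUserRetry` the status is still "error" until the next refresh succeeds, which matches the diagram's check before the interval is resumed.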

File-Level Changes

Change: Add multi-candidate Python interpreter resolution and retry logic for research_ui startup, with safer process lifecycle handling.
Details:
  • Replace single resolvePythonCommand with resolvePythonCandidates that builds and validates a prioritized list of Python commands (local venv, $PYTHON, system default).
  • Introduce retryResearchUiProcess and launchResearchUiProcess helpers to manage interpreter retries on spawn errors or non‑zero exits during startup.
  • Refactor process wiring into bindResearchUiProcess and guard stdout/stderr/exit/error handlers to ignore stale process handles.
  • Reset Python candidate state on server stop and surface a clear workspace error when no usable interpreter is found.
Files: desktop/main.js

Change: Improve renderer workspace lifecycle, snapshot refresh robustness, and limit in-memory detail caching.
Details:
  • Track and unsubscribe the workspace state listener on window unload and centralize refresh timer management via ensureRefreshLoop/stopRefreshLoop.
  • Augment snapshotStatus with consecutive error tracking and a refreshPaused flag, pausing the refresh loop after repeated failures and resuming on explicit retry.
  • Limit detail cache growth by replacing the Map-based detailCache with a fixed-size LruCache and adding its implementation to utils.js.
  • Enhance runtime alerts and retry behavior to communicate a paused refresh, clear the error state, and restart the refresh loop on a successful retry.
Files: desktop/renderer/app.js, desktop/renderer/modules/utils.js

Change: Tighten renderer security by enabling sandboxing for the BrowserWindow.
Details:
  • Switch BrowserWindow webPreferences.sandbox from false to true to run the renderer in a sandboxed environment.
Files: desktop/main.js
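
The centralized timer management described for desktop/renderer/app.js could be sketched as an idempotent start/stop pair. The names ensureRefreshLoop and stopRefreshLoop come from this PR; their signatures, the boolean return values, the 5000 ms default interval, and the `unsubscribeWorkspaceState` name in the comment are assumptions.

```javascript
// Single shared timer handle; null means the loop is not running.
let refreshTimer = null;

// Start the loop only if it is not already running.
function ensureRefreshLoop(tick, intervalMs = 5000) {
  if (refreshTimer !== null) return false;
  refreshTimer = setInterval(tick, intervalMs);
  return true;
}

// Stop the loop if it is running; safe to call repeatedly.
function stopRefreshLoop() {
  if (refreshTimer === null) return false;
  clearInterval(refreshTimer);
  refreshTimer = null;
  return true;
}

// Renderer wiring, shown as a comment because it needs a browser window:
// window.addEventListener("beforeunload", () => {
//   stopRefreshLoop();
//   unsubscribeWorkspaceState(); // hypothetical handle returned by the subscription
// });
```

Keeping a single handle behind these two functions means a retry path can call `ensureRefreshLoop` without risking a second interval running alongside the first.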

Possibly linked issues

  • #(unassigned): PR adds Python retry logic, stricter reachability handling, and safer workspace persistence, directly addressing the desktop hardening issue.


@Whiteks1 Whiteks1 closed this Apr 2, 2026
@Whiteks1 Whiteks1 deleted the codex/desktop-electron-p1-hardening branch April 4, 2026 16:23